Cascading K-means Clustering and K-Nearest Neighbor Classifier for Categorization of Diabetic Patients (IJEAT)
نویسندگان
چکیده
Medical Data mining is the process of extracting hidden patterns from medical data. This paper presents the development of a hybrid model for classifying Pima Indian diabetic database (PIDD). The model consists of three stages. In the first stage, K-means clustering is used to identify and eliminate incorrectly classified instances. In the second stage Genetic algorithm (GA) and Correlation based feature selection (CFS) is used in a cascaded fashion for relevant feature extraction, where GA rendered global search of attributes with fitness evaluation effected by CFS. Finally in the third stage a fine tuned classification is done using K-nearest neighbor (KNN) by taking the correctly clustered instance of first stage and with feature subset identified in the second stage as inputs for the KNN. Experimental results signify the cascaded K-means clustering and KNN along with feature subset identified GA_CFS has enhanced classification accuracy of KNN. The proposed model obtained the classification accuracy of 96.68% for diabetic dataset.
منابع مشابه
FUZZY K-NEAREST NEIGHBOR METHOD TO CLASSIFY DATA IN A CLOSED AREA
Clustering of objects is an important area of research and application in variety of fields. In this paper we present a good technique for data clustering and application of this Technique for data clustering in a closed area. We compare this method with K-nearest neighbor and K-means.
متن کاملImproving Accuracy in Intrusion Detection Systems Using Classifier Ensemble and Clustering
Recently by developing the technology, the number of network-based servicesis increasing, and sensitive information of users is shared through the Internet.Accordingly, large-scale malicious attacks on computer networks could causesevere disruption to network services so cybersecurity turns to a major concern fornetworks. An intrusion detection system (IDS) could be cons...
متن کاملDiagnosis of Tempromandibular Disorders Using Local Binary Patterns
Background: Temporomandibular joint disorder (TMD) might be manifested as structural changes in bone through modification, adaptation or direct destruction. We propose to use Local Binary Pattern (LBP) characteristics and histogram-oriented gradients on the recorded images as a diagnostic tool in TMD assessment.Material and Methods: CBCT images of 66 patients (132 joints) with TMD and 66 normal...
متن کاملText Recognition with k-means Clustering
A thesaurus is a reference work that lists words grouped together according to similarity of meaning (containing synonyms and sometimes antonyms), in contrast to a dictionary, which contains definitions and pronunciations. This paper proposes an innovative approach to improve the classification performance of Persian texts considering a very large thesaurus. The paper proposes a flexible method...
متن کاملComparing pixel-based and object-based algorithms for classifying land use of arid basins (Case study: Mokhtaran Basin, Iran)
In this research, two techniques of pixel-based and object-based image analysis were investigated and compared for providing land use map in arid basin of Mokhtaran, Birjand. Using Landsat satellite imagery in 2015, the classification of land use was performed with three object-based algorithms of supervised fuzzy-maximum likelihood, maximum likelihood, and K-nearest neighbor. Nine combinations...
متن کامل